Computing a Function of Correlated Sources

Milad Sefidgaran and Aslan Tchamkerten



Abstract 

A receiver wants to compute a function f of two correlated sources X and Y and side information Z. What
is the minimum number of bits that needs to be communicated by each transmitter?
In this paper, we derive inner and outer bounds to the rate region of this problem which coincide in the cases
where f is partially invertible and where the sources are independent given the side information. From the former
case we recover the Slepian-Wolf rate region and from the latter case we recover Orlitsky and Roche's single
source result.

I. Introduction 



Given two sources X and Y separately observed by two transmitters, we consider the problem of finding
the minimum number of bits that needs to be sent by each transmitter to a common receiver who has
access to side information Z and wants to compute a given function f(X, Y, Z) with high probability.^1

The first result on this problem was obtained by Körner and Marton [8], who derived the rate region
for the case where f is the sum modulo two of binary X and Y and where p(x, y) is symmetric (no
side information is available at the receiver). Interestingly, this result came before Orlitsky and Roche's
general result for the single source case [13], which provides a closed form expression for the minimum
number of bits that needs to be transmitted to compute f(X, Z) at the receiver, for arbitrary f and p(x, z).^2

However, Körner and Marton's arguments appear to be difficult to generalize to other functions and
probability distributions (for an extension of [8] to sum modulo p and symmetric distributions see [5]).

More recently, Doshi, Shah, and Médard [4] derived conditions under which a rate pair can be achieved
for fixed code length and error probability. These conditions do not, however, provide a single letter 
characterization for the rate region. 
^1 Alternatively, zero error probability has been variously investigated (see, e.g., [1], [11], [12], [18], [21]).

^2 Their result has been generalized to two-round communication [13] and K-round communication [9] in a point-to-point channel. Also,
coding schemes and converses established in [13] have been used in other network configurations, such as cascade networks [3], [17].



A more general setting has been investigated by Nazer and Gastpar [10], who considered the problem of
function computation over a multiple access channel, thereby introducing potential interference between 
transmitters. 

Function computation has also been studied in more general networks, such as in the context of network
coding and decentralized decision making and computation [15].

In this paper we first provide an inner bound and an outer bound for partially invertible functions, i.e.,
when X or Y is a function of both f(X, Y, Z) and Z. These bounds are tight in the case where f is
partially invertible with respect to one of the sources. As a corollary, we recover the Slepian-Wolf rate
region, which corresponds to the case where f is invertible with respect to both sources. From the outer
bound for partially invertible functions, we derive a general outer bound which matches the inner
bound when the sources are independent given the side information. As a corollary, we recover the rate
region for a single source [13]. Finally, a second general outer bound is derived using results from rate
distortion for correlated sources. In general, these two outer bounds cannot be derived from each other.

For a single source X and side information Z, the minimum number of bits needed for computing
a function f(x, z) is the solution of an optimization problem defined over the set of all independent
sets with respect to a characteristic graph defined by X, Z, and f. Indeed, Orlitsky and Roche showed
that, for a single source, allowing for multisets of independent sets doesn't yield any improvement on
achievable rates (see the proof of [13, Theorem 2]). In contrast, our inner and outer bounds are the solutions to
optimization problems defined over multisets of all independent sets with respect to similar characteristic
graphs. Allowing for multisets may indeed increase the set of achievable rate pairs. This is shown via
simulation in an example of a partially invertible function where inner and outer bounds are tight.

An outline of the paper is as follows. In Section II we formally state the problem and provide some
background material and definitions. Section III contains our results, and Section IV is devoted to the
proofs.

II. Problem Statement and Preliminaries 

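Let $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$, and $\mathcal{F}$ be finite sets, and $f : \mathcal{X} \times \mathcal{Y} \times \mathcal{Z} \to \mathcal{F}$. Let $\{(x_i, y_i, z_i)\}_{i=1}^{\infty}$ be independent instances
of random variables (X, Y, Z) taking values over $\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ and distributed according to p(x, y, z).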

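Definition 1 (Code). An $(n, R_X, R_Y)$ code consists of two encoding functions

$$\varphi_X : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR_X}\}, \qquad \varphi_Y : \mathcal{Y}^n \to \{1, 2, \ldots, 2^{nR_Y}\},$$

and a decoding function

$$\psi : \{1, 2, \ldots, 2^{nR_X}\} \times \{1, 2, \ldots, 2^{nR_Y}\} \times \mathcal{Z}^n \to \mathcal{F}^n.$$

The error probability of a code is defined as

$$P\big(\psi(\varphi_X(\mathbf{X}), \varphi_Y(\mathbf{Y}), \mathbf{Z}) \neq f(\mathbf{X}, \mathbf{Y}, \mathbf{Z})\big),$$

where $\mathbf{X} \overset{\text{def}}{=} X_1, \ldots, X_n$ and

$$f(\mathbf{X}, \mathbf{Y}, \mathbf{Z}) \overset{\text{def}}{=} \{f(X_1, Y_1, Z_1), \ldots, f(X_n, Y_n, Z_n)\}.$$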

Definition 2 (Rate Region). A rate pair $(R_X, R_Y)$ is achievable if, for any $\varepsilon > 0$ and all n large enough,
there exists an $(n, R_X, R_Y)$ code whose error probability is no larger than $\varepsilon$. The rate region is the closure
of the set of achievable $(R_X, R_Y)$.

The problem we consider in this paper is to characterize the rate region for given f and p(x, y, z).
Below we recall definitions and properties of conditional characteristic graphs [18], which play
a key role in coding for computing.

Definition 3 (Conditional Characteristic Graph). Given $(X, Y) \sim p(x, y)$ and f(X, Y), the conditional
characteristic graph $G_{X|Y}$ of X given Y is the (undirected) graph whose vertex set is $\mathcal{X}$ and whose edge
set $E(G_{X|Y})$ consists of all pairs $(x_i, x_j)$ for which there exists $y \in \mathcal{Y}$ such that

i. $p(x_i, y) \cdot p(x_j, y) > 0$,
ii. $f(x_i, y) \neq f(x_j, y)$.

Notation. Given two random variables X and V, where X ranges over $\mathcal{X}$ and V over subsets of $\mathcal{X}$,^3 we
write $X \in V$ whenever $P(X \in V) = 1$.

Recall that an independent set of a graph G is a subset of vertices no two of which are connected. A
maximal independent set is an independent set that is not included in any other independent set. The set
of independent sets of G and the set of maximal independent sets of G are denoted by $\Gamma(G)$ and $\Gamma^*(G)$,
respectively.

Given a finite set S, we use $\mathrm{M}(S)$ to denote the collection of all multisets of S. (Recall that a multiset
of a set S is a collection of elements from S possibly with repetitions, e.g., if S = {0, 1}, then {0, 1, 1}
is a multiset.)

^3 I.e., a sample of V is a subset of $\mathcal{X}$.




Fig. 1. (a) $G_{X|Y}$ and $G_{Y|X}$, (b) and (c) $G_{X|W}$.



Definition 4 (Conditional Graph Entropy [13]). The conditional entropy of a graph is defined as^4

$$H_{G_{X|Y}}(X|Y) \overset{\text{def}}{=} \min_{\substack{V - X - Y \\ X \in V \in \Gamma(G_{X|Y})}} I(V; X|Y) = \min_{\substack{V - X - Y \\ X \in V \in \mathrm{M}(\Gamma(G_{X|Y}))}} I(V; X|Y).$$

The second equality in the above expression was established in [13].

We now extend the definition of conditional characteristic graph to allow conditioning on variables that 
take values over independent sets. 

Definition 5 (Generalized Conditional Characteristic Graph). Given $(X, Y, W, Z) \sim p(x, y, w, z)$ and
f(X, Y, Z) such that $Y \in W \in \Gamma(G_{Y|X,Z})$,^5 let $f_Y(x, w, z) = f(x, y, z)$ for $x \in \mathcal{X}$, $z \in \mathcal{Z}$, $y \in w \in \Gamma(G_{Y|X,Z})$,
and $p(x, y, w, z) > 0$. The generalized conditional characteristic graph of X given W and Z,
denoted by $G_{X|W,Z}$, is the conditional characteristic graph of X given (W, Z) with respect to the marginal
distribution p(x, w, z) and $f_Y(X, W, Z)$.

Example 1. Let X and Y be random variables defined over the alphabets $\mathcal{X}$ and $\mathcal{Y}$, respectively,
with $\mathcal{X} = \mathcal{Y} = \{1, 2, 3, 4\}$, and with a probability distribution such that $P(X = Y) = 0$ and that is uniform over the pairs
$(i, j) \in \mathcal{X} \times \mathcal{Y}$ with $i \neq j$. The receiver wants to decide whether X > Y or Y > X, i.e., compute the
function f(X, Y) defined as

$$f(x, y) = \begin{cases} 0, & \text{if } x < y, \\ 1, & \text{if } x > y. \end{cases} \tag{1}$$

Fig. 1(a) depicts $G_{X|Y}$, which is equal to $G_{Y|X}$ by symmetry. Hence,

$$\Gamma(G_{X|Y}) = \Gamma(G_{Y|X}) = \{\{1\}, \{2\}, \{3\}, \{4\}, \{1, 2\}, \{2, 3\}, \{3, 4\}\},$$

$$\Gamma^*(G_{X|Y}) = \Gamma^*(G_{Y|X}) = \{\{1, 2\}, \{2, 3\}, \{3, 4\}\}.$$
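The sets in Example 1 can be checked mechanically. The following is a minimal sketch, not part of the original paper: the helper names `characteristic_graph`, `independent_sets`, and `maximal_sets` are hypothetical, and the brute-force enumeration is only meant for small alphabets such as this one.

```python
from itertools import combinations

def characteristic_graph(xs, ys, p, f):
    """Edges of G_{X|Y} (Definition 3): x_i ~ x_j iff some y has
    p(x_i, y) * p(x_j, y) > 0 and f(x_i, y) != f(x_j, y)."""
    edges = set()
    for xi, xj in combinations(xs, 2):
        if any(p(xi, y) * p(xj, y) > 0 and f(xi, y) != f(xj, y) for y in ys):
            edges.add(frozenset((xi, xj)))
    return edges

def independent_sets(xs, edges):
    """All non-empty vertex subsets that contain no edge (brute force)."""
    result = []
    for r in range(1, len(xs) + 1):
        for subset in combinations(xs, r):
            if not any(frozenset(e) in edges for e in combinations(subset, 2)):
                result.append(frozenset(subset))
    return result

def maximal_sets(sets):
    """Independent sets that are not strictly contained in another one."""
    return [s for s in sets if not any(s < t for t in sets)]

alphabet = [1, 2, 3, 4]
p = lambda x, y: 0.0 if x == y else 1.0 / 12   # uniform over the 12 pairs with x != y
f = lambda x, y: 0 if x < y else 1             # only matters where p(x, y) > 0

edges = characteristic_graph(alphabet, alphabet, p, f)
gamma = independent_sets(alphabet, edges)
gamma_star = maximal_sets(gamma)
print(sorted(map(sorted, gamma)), sorted(map(sorted, gamma_star)))
```

Under these assumptions the edge set is {1,3}, {1,4}, {2,4}, which is consistent with the independent sets listed in the example.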

^4 We use the notation U - V - W whenever random variables (U, V, W) form a Markov chain.
^5 By definition $\Gamma(G_{Y|X,Z}) = \Gamma(G_{Y|(X,Z)})$.



An example of a random variable W that satisfies

$$Y \in W \in \Gamma(G_{Y|X}) \tag{2}$$

is one whose support set is

$$\mathcal{W} = \{\{1\}, \{2\}, \{3\}, \{4\}, \{1, 2\}\}.$$

For such a W, the generalized conditional characteristic graph $G_{X|W}$ is depicted in Fig. 1(b) and we
have

$$\Gamma(G_{X|W}) = \{\{1\}, \{2\}, \{3\}, \{4\}, \{2, 3\}, \{3, 4\}\}.$$

Another example of a random variable W that satisfies (2) is one whose support set is

$$\mathcal{W} = \{\{2\}, \{4\}, \{1, 2\}, \{2, 3\}\}.$$

For such a W, the generalized conditional characteristic graph $G_{X|W}$ is depicted in Fig. 1(c) and we have

$$\Gamma(G_{X|W}) = \{\{1\}, \{2\}, \{3\}, \{4\}, \{3, 4\}\}.$$

Note that, in general,

$$E(G_{X|Y,Z}) \subseteq E(G_{X|W,Z})$$

whenever

$$Y \in W \in \Gamma(G_{Y|X,Z}).$$
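The two generalized graphs of Example 1 can be reproduced with a short script. This is a sketch under stated assumptions rather than the authors' procedure: the function names are hypothetical, there is no side information Z, and it is assumed that $p(w|y) > 0$ for every $y \in w$ and every w in the support.

```python
from itertools import combinations

alphabet = [1, 2, 3, 4]
p_xy = lambda x, y: 0.0 if x == y else 1.0 / 12   # distribution of Example 1
f = lambda x, y: 0 if x < y else 1

def generalized_edges(support_w):
    """Edges of G_{X|W} (Definition 5 without Z), assuming p(w|y) > 0 for all y in w."""
    edges = set()
    for w in support_w:
        # f_Y(x, w): the common value of f(x, y) over y in w with p(x, y) > 0.
        f_y = {}
        for x in alphabet:
            values = {f(x, y) for y in w if p_xy(x, y) > 0}
            if values:
                assert len(values) == 1   # holds because w is independent in G_{Y|X}
                f_y[x] = values.pop()
        for xi, xj in combinations(sorted(f_y), 2):
            if f_y[xi] != f_y[xj]:
                edges.add(frozenset((xi, xj)))
    return edges

def independent_sets(edges):
    """All non-empty subsets of the alphabet that contain no edge."""
    return {frozenset(s)
            for r in range(1, len(alphabet) + 1)
            for s in combinations(alphabet, r)
            if not any(frozenset(e) in edges for e in combinations(s, 2))}

support_b = [{1}, {2}, {3}, {4}, {1, 2}]   # first support set, Fig. 1(b)
support_c = [{2}, {4}, {1, 2}, {2, 3}]     # second support set, Fig. 1(c)
print(sorted(map(sorted, independent_sets(generalized_edges(support_b)))))
print(sorted(map(sorted, independent_sets(generalized_edges(support_c)))))
```

Compared with $G_{X|Y}$, the set {1, 2} adds the edge between vertices 1 and 2, and the set {2, 3} adds the edge between vertices 2 and 3, which illustrates the edge inclusion stated above.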

The following lemma provides sufficient conditions under which the generalized conditional characteristic
graph and the conditional characteristic graph are the same, i.e., for which

$$E(G_{X|Y,Z}) = E(G_{X|W,Z}).$$

Lemma 1. Given $(V, X, Y, Z) \sim p(v, x, y, z)$ and f(X, Y, Z), we have

$$G_{Y|V,Z} = G_{Y|X,Z}$$

for all V such that $X \in V \in \Gamma(G_{X|Y,Z})$, in each of the following cases:

a. $p(x, y, z) > 0$ for all $(x, y, z) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$;
b. $G_{X|Y,Z}$ is a complete graph or, equivalently, $\Gamma(G_{X|Y,Z})$ consists only of singletons;
c. X and Y are independent given Z.

We use the following definition in the Analysis and Appendix sections. 

Definition 6 (Support set of a random variable). Given $(V, X) \sim p(v, x)$, the support set of V with respect
to X is the set valued random variable

$$S_X(V) = \{x : p(V, x) > 0\}$$

whenever there is a one-to-one correspondence between v and $\{x : p(v, x) > 0\}$.

If v and $\{x : p(v, x) > 0\}$ are not in one-to-one correspondence, $S_X(V)$ is defined as follows. First,
label $\{x : p(v, x) > 0\}$ with different indices for all v's. $S_X(V)$ is then defined as in the one-to-one
correspondence case, but with respect to this labeling.

Note that, by definition, V and $S_X(V)$ are in one-to-one correspondence.

III. Results 

Our results are often stated in terms of certain random variables V and W which can usefully be 
interpreted as auxiliaries used to construct the codebooks for transmitter X and transmitter Y, respectively. 
This interpretation is consistent with the proofs of the results. 

Proposition 1 provides a general inner bound to the rate region:

Proposition 1 (Inner bound). $(R_X, R_Y)$ is achievable whenever

$$\begin{aligned} R_X &\geq I(V; X | W, Z), \\ R_Y &\geq I(Y; W | V, Z), \\ R_X + R_Y &\geq I(V; X | Z) + I(Y; W | V, Z), \end{aligned}$$

for some V and W that satisfy

$$V - X - (Y, W, Z),$$
$$(V, X, Z) - Y - W,$$

and either

$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V,Z})),$$

or, equivalently,

$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$
$$X \in V \in \mathrm{M}(\Gamma(G_{X|W,Z})).$$

Note that when there is no side information at the decoder, i.e., when Z is a constant, the two Markov
chain constraints in Proposition 1 are equivalent to the single long Markov chain

$$V - X - Y - W,$$

which implies that the sum rate inequality of Proposition 1 becomes

$$R_X + R_Y \geq I(X, Y; V, W).$$

The next result provides an outer bound to the rate region when the function is partially invertible with
respect to X (with respect to Y, respectively), i.e., when X (Y, respectively) is a deterministic function
of both f(X, Y, Z) and Z. Interestingly, this bound implies a general outer bound to the rate region (see
Corollary 2 below).

Proposition 2 (Outer Bound - Partially Invertible Function). If f is partially invertible with respect to
X, then $(R_X, R_Y)$ satisfies

$$\begin{aligned} R_X &\geq H(X | W, Z), \\ R_Y &\geq I(Y; W | X, Z), \\ R_X + R_Y &\geq H(X | Z) + I(Y; W | X, Z), \end{aligned}$$

for some W that satisfies

$$(X, Z) - Y - W,$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$

with

$$|\mathcal{W}| \leq |\mathcal{Y}| + 2.$$

Proposition 1 with V = X together with Proposition 2 yields the following result:

Theorem 1 (Rate Region - Partially Invertible Function). If f is partially invertible with respect to X,
then the rate region is the closure of rate pairs $(R_X, R_Y)$ that satisfy the conditions of Proposition 2.

In Section IV we provide an alternative proof of Theorem 1 using the canonical theory developed
in [6]. This alternative proof, however, doesn't establish the cardinality bound $|\mathcal{W}| \leq |\mathcal{Y}| + 2$.

Example 2. Consider the situation with no side information given by $f(x, y) = (-1)^y \cdot x$, with $\mathcal{X} = \mathcal{Y} = \{0, 1, 2\}$, and

$$p(x, y) = \begin{bmatrix} .21 & .03 & .12 \\ .06 & .15 & .16 \\ .03 & .12 & .12 \end{bmatrix}.$$

Since f(X, Y) is partially invertible with respect to X, we can use Theorem 1 to numerically evaluate
the rate region. The obtained region is given by the union of the three shaded areas in Fig. 2. These areas
are discussed later, after Example 4.

To numerically evaluate the rate region, we would need to consider the set of all conditional distributions
p(w|y), $y \in \mathcal{Y}$, $w \in \mathrm{M}(\Gamma(G_{Y|X}))$. Since $|\mathcal{W}| \leq 5$, $\mathrm{M}(\Gamma(G_{Y|X}))$ consists of multisets of

$$\Gamma(G_{Y|X}) = \{\{0\}, \{1\}, \{2\}, \{0, 2\}\}$$

whose cardinalities are bounded by 5.

However, as we now show, among all possible $4^5 = 1024$ multisets with cardinality at most 5,
considering just the multiset {{1}, {0, 2}, {0, 2}, {0, 2}, {0, 2}} gives the rate region.

Consider a multiset with cardinality at most 5.

1. If the multiset does not contain sample {1}, then the condition $\sum_{w \in \mathcal{W}} p(w | Y = 1) = 1$, hence the
condition $Y \in W$, cannot be satisfied. Therefore this multiset is not admissible, and we can ignore
it.

2. If the multiset contains two samples $w_1 = \{1\}$ and $w_2 = \{1\}$ with conditional probabilities $p(w_1 | Y = 1)$
and $p(w_2 | Y = 1)$, respectively, replacing them by one sample $w = \{1\}$ whose conditional
probability is $p(w | Y = 1) = p(w_1 | Y = 1) + p(w_2 | Y = 1)$ gives the same terms H(X|W) and
I(Y; W|X), hence the same rate pairs. Therefore, without loss of optimality we can consider only
multisets which contain a unique sample of {1}.

3. If the multiset contains a sample $w_1 = \{0\}$ with arbitrary conditional probability $p(w_1 | Y = 0)$,
replacing it with sample $w_2 = \{0, 2\}$ whose conditional probabilities are $p(w_2 | Y = 0) = p(w_1 | Y = 0)$
and $p(w_2 | Y = 2) = 0$ gives the same rate pairs. (The same argument holds for a sample $w_1 = \{2\}$.)
From 1., 2., and 3., multisets with one sample of {1} and multiple copies of {0, 2} give the rate
region.

4. If the multiset has cardinality k < 5, adding 5 - k samples {0, 2} with zero conditional probabilities
gives the same rate pairs.

Fig. 2. Example of a rate region for a partially invertible function.

It follows that the rate region can be obtained by considering the unique multiset

$$\{w_1 = \{1\}, w_2 = \{0, 2\}, w_3 = \{0, 2\}, w_4 = \{0, 2\}, w_5 = \{0, 2\}\}$$

and by optimizing over the conditional probabilities {p(w|y)} that satisfy

$$\begin{aligned} p(w_1 | Y = 1) &= 1, \\ p(w_1 | Y = j) &= 0, \quad j \in \{0, 2\}, \\ \sum_{i=2}^{5} p(w_i | Y = 0) &= 1, \\ \sum_{i=2}^{5} p(w_i | Y = 2) &= 1, \\ p(w_i | Y = 1) &= 0, \quad i \in \{2, 3, 4, 5\}. \end{aligned}$$

Notice that this optimization has only six degrees of freedom.
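To make the optimization concrete, the two quantities H(X|W) and I(Y;W|X) that enter Theorem 1 can be evaluated for any admissible p(w|y). The sketch below is illustrative only: it assumes that the rows of the table in Example 2 index x and the columns index y (the text does not state the orientation), and the names `rate_terms`, `reveal_y`, and `merge_02` are hypothetical.

```python
from math import log2

# Assumption: rows of the table in Example 2 index x, columns index y.
P_XY = [[0.21, 0.03, 0.12],
        [0.06, 0.15, 0.16],
        [0.03, 0.12, 0.12]]

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(q * log2(q) for q in probs if q > 0)

def rate_terms(p_w_given_y):
    """Return (H(X|W), I(Y;W|X)) for p(w|y) given as one row per y.

    Column 0 stands for w_1 = {1}; columns 1..4 stand for the four copies of {0, 2}.
    """
    n_w = len(p_w_given_y[0])
    # Joint p(x, y, w) = p(x, y) p(w|y), which encodes the Markov chain X - Y - W.
    joint = {(x, y, w): P_XY[x][y] * p_w_given_y[y][w]
             for x in range(3) for y in range(3) for w in range(n_w)}

    def h(keep):
        # Entropy of the marginal on the coordinates listed in `keep`.
        marg = {}
        for key, q in joint.items():
            sub = tuple(key[i] for i in keep)
            marg[sub] = marg.get(sub, 0.0) + q
        return entropy(marg.values())

    h_x_given_w = h((0, 2)) - h((2,))
    # I(Y;W|X) = H(X,Y) + H(X,W) - H(X,Y,W) - H(X)
    i_yw_given_x = h((0, 1)) + h((0, 2)) - h((0, 1, 2)) - h((0,))
    return h_x_given_w, i_yw_given_x

# One admissible choice: y = 0 -> w_2, y = 1 -> w_1, y = 2 -> w_3 (W reveals Y).
reveal_y = [[0, 1, 0, 0, 0], [1, 0, 0, 0, 0], [0, 0, 1, 0, 0]]
# Another: y = 0 and y = 2 both -> w_2 (W only says whether Y = 1).
merge_02 = [[0, 1, 0, 0, 0], [1, 0, 0, 0, 0], [0, 1, 0, 0, 0]]
print(rate_terms(reveal_y), rate_terms(merge_02))
```

Sweeping the six free parameters of p(w|y) and collecting the resulting rate constraints would trace out the region; the two choices above only illustrate the trade-off between a smaller H(X|W) and a smaller I(Y;W|X).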

When f is invertible, i.e., when (X, Y) is a function of both f(X, Y, Z) and Z, Theorem 1 reduces to
the Slepian-Wolf rate region [14]:

Corollary 1 (Rate Region - Invertible Function). If f is invertible, then the rate region is the closure of
rate pairs $(R_X, R_Y)$ such that

$$\begin{aligned} R_X &\geq H(X | Y, Z), \\ R_Y &\geq H(Y | X, Z), \\ R_X + R_Y &\geq H(X, Y | Z). \end{aligned}$$
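The three bounds of Corollary 1 are straightforward to evaluate for a given joint distribution. The following is a minimal sketch: the function names and the dictionary representation of p(x, y, z) are assumptions made for illustration.

```python
from math import log2
from collections import defaultdict

def entropy(pmf):
    """Shannon entropy in bits of a pmf stored as {outcome: probability}."""
    return -sum(q * log2(q) for q in pmf.values() if q > 0)

def marginal(pmf, keep):
    """Marginalize a pmf keyed by tuples onto the coordinates in `keep`."""
    out = defaultdict(float)
    for key, q in pmf.items():
        out[tuple(key[i] for i in keep)] += q
    return out

def slepian_wolf_bounds(p_xyz):
    """Return (H(X|Y,Z), H(Y|X,Z), H(X,Y|Z)) for a joint pmf keyed by (x, y, z)."""
    h_xyz = entropy(p_xyz)
    h_x_given_yz = h_xyz - entropy(marginal(p_xyz, (1, 2)))
    h_y_given_xz = h_xyz - entropy(marginal(p_xyz, (0, 2)))
    h_xy_given_z = h_xyz - entropy(marginal(p_xyz, (2,)))
    return h_x_given_yz, h_y_given_xz, h_xy_given_z

def in_region(rx, ry, bounds):
    """Check the three inequalities of Corollary 1 for the rate pair (rx, ry)."""
    a, b, c = bounds
    return rx >= a and ry >= b and rx + ry >= c

# Toy check: X and Y independent fair bits, Z constant.
toy = {(x, y, 0): 0.25 for x in (0, 1) for y in (0, 1)}
bounds = slepian_wolf_bounds(toy)
print(bounds, in_region(1.0, 1.0, bounds), in_region(0.5, 1.5, bounds))
```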

Example 3. Let $\mathcal{X} = \mathcal{Y} = \{1, 2, 3, 5, 7\}$, $\mathcal{Z} = \{1, 2, 3\}$, let p(x, y, z) be such that

$$p(x, y, z) \cdot p(y, x, z) = 0$$

for any x, y, z with $x \neq y$, and let

$$f(x, y, z) = x \cdot y \cdot z.$$

Since f(X, Y, Z) is invertible, the rate region is the Slepian-Wolf rate region given by Corollary 1.

Proof of Corollary 1: To deduce Corollary 1 from Theorem 1, suppose f is invertible and suppose
(X, Y, W, Z) satisfy the conditions of Proposition 2 for given $(R_X, R_Y)$. Note first that $H(X | W, Z) \geq H(X | Y, Z)$
by Markovity, hence the first inequality in Proposition 2 gives

$$R_X \geq H(X | Y, Z). \tag{3}$$

Further, knowledge of (W, X, Z) implies knowledge of f(X, Y, Z), by property of W. Hence,

$$H(Y | W, X, Z) = H(Y | W, X, Z, f(X, Y, Z)) = 0$$

since the function is invertible. Therefore, the second and third inequalities in Proposition 2 yield

$$\begin{aligned} R_Y &\geq H(Y | X, Z), \\ R_X + R_Y &\geq H(X, Y | Z). \end{aligned} \tag{4}$$

Since the bound given by inequalities (3) and (4) is achievable by letting W = Y in Proposition 2,
Corollary 1 follows. ■

The following result is a consequence of Proposition 2.

Corollary 2 (General Outer Bound 1). If $(R_X, R_Y)$ is achievable then

$$\begin{aligned} R_X &\geq H_{G_{X|Y,Z}}(X | Y, Z), \\ R_Y &\geq H_{G_{Y|X,Z}}(Y | X, Z), \\ R_X + R_Y &\geq H_{G_{X,Y|Z}}(X, Y | Z). \end{aligned}$$

Proof of Corollary 2: Corollary 2 is obtained by reducing the multi-source computation problem to
a single source computation problem and, as such, can also be derived from [13, Theorem 1].

To obtain Corollary 2 from Proposition 2, we proceed as follows. For the second inequality in Corollary 2,
note that if $(R_X, R_Y)$ is achievable when computing f(X, Y, Z) with side information Z, then $(0, R_Y)$ is
achievable when computing the function

$$g(x, y, z) = (f(x, y, z), x)$$

with side information (X, Z). Since g(X, Y, Z) is partially invertible with respect to X, from Proposition 2
we get

$$R_Y \geq I(Y; W | X, Z),$$

where W satisfies

$$(X, Z) - Y - W,$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|X,Z})).$$

The second inequality in Corollary 2 then follows from Definition 4. The first inequality in Corollary 2
is obtained similarly, by swapping the roles of X and Y.

Finally, if $(R_X, R_Y)$ is achievable with side information Z, then $R_X + R_Y$ is clearly achievable for a
single source (X, Y) and side information Z. By applying the first inequality in Corollary 2 to this single
source one deduces the third inequality. ■

The inner and outer bounds given by Proposition 1 and Corollary 2 are tight for independent sources,
hence also for the single source computation problem,^6 for which we recover [13, Theorem 1].

Corollary 3 (Rate Region - Independent Sources). If X and Y are independent given Z, the rate region
is the closure of rate pairs $(R_X, R_Y)$ such that

$$\begin{aligned} R_X &\geq H_{G_{X|Y,Z}}(X | Y, Z), \\ R_Y &\geq H_{G_{Y|X,Z}}(Y | X, Z). \end{aligned}$$

Hence, if Y is constant, $R_X$ is achievable if and only if $R_X \geq H_{G_{X|Z}}(X | Z)$.

^6 A single source can be seen as two sources with one of them being constant.

Proof of Corollary 3: For the converse of Corollary 3, note that the two inequalities in the corollary
correspond to the first two inequalities of Corollary 2.

For achievability, suppose V and W satisfy the conditions of Proposition 1, i.e.,

$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V,Z})),$$

and

$$V - X - (Y, W, Z),$$
$$(V, X, Z) - Y - W.$$

From these two Markov chains and the fact that X and Y are independent given Z, we deduce the long
Markov chain

$$V - X - Z - Y - W.$$

It then follows that

$$I(V; X | W, Z) = I(V; X | Y, Z)$$

and

$$I(Y; W | V, Z) = I(Y; W | X, Z).$$

Using Proposition 1 we deduce that the rate pair $(R_X, R_Y)$ given by

$$R_X = I(V; X | Y, Z)$$

and

$$R_Y = I(Y; W | X, Z)$$

is achievable. Now, since X and Y are independent given Z, $G_{Y|V,Z} = G_{Y|X,Z}$ by Claim c. of Lemma 1.
This allows us to minimize the above two mutual information terms separately, which shows that the rate
pair

$$R_X = \min_{\substack{X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})) \\ V - X - (Y, Z)}} I(V; X | Y, Z),$$

$$R_Y = \min_{\substack{Y \in W \in \mathrm{M}(\Gamma(G_{Y|X,Z})) \\ W - Y - (X, Z)}} I(Y; W | X, Z)$$

is achievable. (Notice that I(V; X|Y, Z) is a function of the joint distribution p(v, x, y, z) only, thus the
minimization constraint V - X - (Y, Z, W) reduces to V - X - (Y, Z). A similar comment applies to
the minimization of I(Y; W|X, Z).) The result then follows from Definition 4. ■



Example 4. Let $Z \in \{1, 2, 3\}$, let U and V be independent uniform random variables over {-1, 0, 1} and
{0, 1, 2}, respectively, and let X = Z + U and Y = Z + V. The receiver wants to decide if X is equal
to Y, i.e., compute the function f(X, Y) defined as

$$f(x, y) = \begin{cases} 0, & \text{if } x \neq y, \\ 1, & \text{if } x = y. \end{cases} \tag{5}$$

Since X and Y are independent given Z, the rate region is given by Corollary 3. It can be checked that

$$\Gamma^*(G_{X|Y,Z}) = \{\{0, 2\}, \{0, 3\}, \{0, 1, 4\}\},$$

$$\Gamma^*(G_{Y|X,Z}) = \{\{2, 5\}, \{3, 5\}, \{1, 4, 5\}\},$$

and a numerical evaluation of conditional graph entropy gives

$$H_{G_{X|Y,Z}}(X | Y, Z) = H_{G_{Y|X,Z}}(Y | X, Z) \approx 1.28.$$

Hence the rate region is given by the set of rate pairs satisfying

$$\begin{aligned} R_X &\geq 1.28, \\ R_Y &\geq 1.28. \end{aligned}$$

For a single source, Orlitsky and Roche [13] showed that $H_{G_{X|Z}}(X | Z)$ (see Definition 4) is achieved
by some V taking values over maximal independent sets $\Gamma^*(G_{X|Z})$. In contrast, for two sources, the
restriction to maximal independent sets may induce a loss of optimality. Fig. 2 in Example 2 shows the
rate region for a partially invertible function, when restricting V and W to be over maximal independent
sets (gray area), all independent sets (gray and light gray areas), and multisets of independent sets (union of
gray, light gray, and black areas). Denoting these areas by $\mathcal{R}(\Gamma^*)$, $\mathcal{R}(\Gamma)$, and $\mathcal{R}(\mathrm{M}(\Gamma))$,^7 respectively, we
thus numerically get the strict set inclusions

$$\mathcal{R}(\Gamma^*) \subsetneq \mathcal{R}(\Gamma) \subsetneq \mathcal{R}(\mathrm{M}(\Gamma)).$$

Numerical evidence suggests that the small difference between $\mathcal{R}(\Gamma)$ and $\mathcal{R}(\mathrm{M}(\Gamma))$ is unrelated to the
specificity of the probability distribution p(x, y) in the example (i.e., by choosing other distributions the
difference between $\mathcal{R}(\Gamma)$ and $\mathcal{R}(\mathrm{M}(\Gamma))$ remains small).

We now provide a second rate region outer bound which is derived using results from rate distortion
for correlated sources [16]:

Proposition 3 (General Outer Bound 2). If $(R_X, R_Y)$ is achievable, then

$$\begin{aligned} R_X &\geq I(X, Y; V | W, Z), \\ R_Y &\geq I(X, Y; W | V, Z), \\ R_X + R_Y &\geq I(X, Y; V, W | Z), \end{aligned}$$

for some random variables (V, W) that satisfy $H(f(X, Y, Z) | V, W, Z) = 0$ and

$$V - X - (Y, Z),$$
$$(X, Z) - Y - W.$$

Using Lemma 3 in the Appendix, one can show that had the above two Markov chain constraints been

$$V - X - (Y, W, Z),$$
$$(V, X, Z) - Y - W,$$

the outer bound given by Proposition 3 would be equal to the inner bound given by Proposition 1. However,
since this inner bound is not tight in general, as we show in the coming subsection, these hypothetical
Markov chains don't hold in general.

Finally note that it may be difficult to extract an explicit outer bound from Proposition 3 since it is
implicitly characterized by random variables (V, W) that should (in part) satisfy

$$H(f(X, Y, Z) | V, W, Z) = 0.$$

^7 With $|\mathrm{M}(\Gamma)| \leq 5$.

Gap between upper and lower bounds

In general, Proposition 1 and Corollary 2 need not be tight, such as for the sum modulo 2 of binary
X and Y (no side information) with symmetric distribution, i.e.,

$$p(x, y) = \begin{bmatrix} \frac{p}{2} & \frac{1-p}{2} \\ \frac{1-p}{2} & \frac{p}{2} \end{bmatrix}.$$

Assuming $p \in (0, 1)$, $\Gamma(G_{X|Y})$ and $\Gamma(G_{Y|X})$ both consist of singletons. This implies that the achievable
region given by Proposition 1 reduces to

$$\begin{aligned} R_X &\geq H(X | W), \\ R_Y &\geq H(Y | V), \\ R_X + R_Y &\geq H(X) + H(Y | V), \end{aligned} \tag{6}$$

since

$$H(X | V) = H(Y | W) = 0$$

for all (V, X, Y, W) that satisfy

$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V})).$$

Note that since $\Gamma(G_{Y|V})$ (which is equal to $\Gamma(G_{Y|X})$ according to Lemma 1, Claim a.) consists of
singletons,

$$H(X | W) = H(X | Y, W) \leq H(X | Y). \tag{7}$$

Furthermore, because of the Markov chain constraint

$$(V, X) - Y - W,$$

we have

$$H(X | W) \geq H(X | Y), \tag{8}$$

by the data processing inequality. Hence, (7) and (8) yield

$$H(X | W) = H(X | Y),$$

and, from the same argument,

$$H(Y | V) = H(Y | X).$$

Inequalities (6) thus become

$$\begin{aligned} R_X &\geq H(X | Y), \\ R_Y &\geq H(Y | X), \\ R_X + R_Y &\geq H(X, Y). \end{aligned} \tag{9}$$

Therefore the achievable region given by Proposition 1 reduces to the Slepian-Wolf rate region. This
achievable region isn't maximal since the rate region is given by the set of rate pairs that satisfy only
the two constraints

$$\begin{aligned} R_X &\geq H(X | Y), \\ R_Y &\geq H(Y | X), \end{aligned}$$

as shown by Körner and Marton [8].
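The size of this gap is easy to quantify for the doubly symmetric binary source. The sketch below is illustrative (the function names are hypothetical); it uses the facts that the marginals are uniform, so H(X) = 1 bit, and that H(X|Y) = H(Y|X) equals the binary entropy of p.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def sum_rate_bounds(p):
    """Minimum sum rates for the symmetric binary source with P(X = Y) = p.

    Returns (Slepian-Wolf sum rate, Korner-Marton sum rate).
    """
    h = binary_entropy(p)      # H(X|Y) = H(Y|X)
    slepian_wolf = 1.0 + h     # H(X, Y) = H(X) + H(Y|X)
    korner_marton = 2.0 * h    # H(X|Y) + H(Y|X)
    return slepian_wolf, korner_marton

for p in (0.05, 0.1, 0.25, 0.5):
    sw, km = sum_rate_bounds(p)
    print(f"p = {p}: Slepian-Wolf {sw:.3f}, Korner-Marton {km:.3f}, gap {sw - km:.3f}")
```

The gap equals 1 minus the binary entropy of p, so it vanishes only at p = 1/2, where X and Y are independent.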



IV. Analysis 

Proof of Lemma 1: Suppose $X \in V \in \Gamma(G_{X|Y,Z})$. For all claims, we show that $E(G_{Y|V,Z}) \subseteq E(G_{Y|X,Z})$,^8
i.e., if two nodes are connected in $G_{Y|V,Z}$, then they are also connected in $G_{Y|X,Z}$. The
opposite direction, $E(G_{Y|X,Z}) \subseteq E(G_{Y|V,Z})$, follows from the definition of generalized conditional
characteristic graph.

Suppose nodes $y_1$ and $y_2$ are connected in $G_{Y|V,Z}$. This means that there exist $v \in \mathcal{V}$, $x_1, x_2 \in v$ and
$z \in \mathcal{Z}$ such that

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0,$$

and

$$f(x_1, y_1, z) \neq f(x_2, y_2, z).$$

If $x_1 = x_2$, then $y_1$ and $y_2$ are also connected in $G_{Y|X,Z}$ according to the definition of conditional
characteristic graph. We now assume $x_1 \neq x_2$ and prove Claims a., b., and c.

a. Since all probabilities are positive we have $p(x_1, y_2, z) > 0$, hence

$$p(x_1, y_1, z) \cdot p(x_1, y_2, z) > 0,$$

and $x_1, x_2 \in v \in \Gamma(G_{X|Y,Z})$ yields

$$f(x_1, y_2, z) = f(x_2, y_2, z) \neq f(x_1, y_1, z),$$

which implies that $y_1$ and $y_2$ are also connected in $G_{Y|X,Z}$.

b. $\Gamma(G_{X|Y,Z})$ consists of singletons, so $x_1, x_2 \in v \in \Gamma(G_{X|Y,Z})$ yields $x_1 = x_2$, and thus $y_1$ and $y_2$ are
also connected in $G_{Y|X,Z}$ as we showed above.

c. From independence of X and Y given Z we have

$$p(x, y, z) = p(z) \cdot p(x | z) \cdot p(y | z).$$

Hence, since

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0,$$

we have

$$p(z) \cdot p(x_1 | z) \cdot p(y_2 | z) > 0,$$

i.e., $p(x_1, y_2, z) > 0$. The rest of the proof is the same as for Claim a. ■

^8 E(G) is the set of edges of graph G.
Proof of Proposition 1: We consider a coding scheme similar to the Berger-Tung rate distortion
coding scheme [16].

Note that rate distortion achievability results do not, in general, provide a direct way of establishing
achievability results for coding for computing problems. Indeed, for rate distortion problems one usually
considers the average distortion between the source and the reconstruction block, whereas in computation
problems one usually considers the more stringent block distortion criterion [19].

We consider a two-step coding procedure: a compression phase followed by Slepian-Wolf coding [14]
of the compressed sequences.

Pick V and W such that

$$V - X - (Y, W, Z),$$
$$(V, X, Z) - Y - W,$$
$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V,Z})).$$

Assume these random variables together with X, Y, Z are distributed according to some p(v, x, y, w, z).
For $v \in \Gamma(G_{X|Y,Z})$ and $w \in \Gamma(G_{Y|V,Z})$, define f(v, w, z) to be equal to f(x, y, z) for $x \in v$ and $y \in w$
such that p(x, y, z) > 0. (Notice that all such (x, y) give the same f(x, y, z).) Further, for $\mathbf{v} = (v_1, \ldots, v_n)$
and $\mathbf{w} = (w_1, \ldots, w_n)$ let

$$f(\mathbf{v}, \mathbf{w}, \mathbf{z}) = \{f(v_1, w_1, z_1), \ldots, f(v_n, w_n, z_n)\}.$$



Randomly generate $2^{nI(V;X)}$ independent sequences

$$\mathbf{v}^{(i)} = (v_1^{(i)}, v_2^{(i)}, \ldots, v_n^{(i)}), \quad i \in \{1, 2, \ldots, 2^{nI(V;X)}\},$$

in an i.i.d. manner according to the marginal distribution p(v), and randomly and uniformly bin these
sequences into $2^{nR_X}$ bins. Similarly, randomly generate $2^{nI(Y;W)}$ independent sequences

$$\mathbf{w}^{(i)} = (w_1^{(i)}, w_2^{(i)}, \ldots, w_n^{(i)}), \quad i \in \{1, 2, \ldots, 2^{nI(Y;W)}\},$$

in an i.i.d. manner according to p(w), and randomly and uniformly bin them into $2^{nR_Y}$ bins. Reveal the
bin assignments $\varphi_X$ and $\varphi_Y$ to the encoders and to the decoder.

Encoding: The X-transmitter finds a sequence $\mathbf{v}$ that is jointly robust typical with source sequence $\mathbf{x}$,
and sends the index of the bin that contains $\mathbf{v}$, i.e., $\varphi_X(\mathbf{v})$.
Recall that $(\mathbf{v}, \mathbf{x})$ are jointly $\delta$-robust typical [13] if

$$|P_{\mathbf{v},\mathbf{x}}(v, x) - p(v, x)| \leq \delta \cdot p(v, x)$$

for all $(v, x) \in \mathcal{V} \times \mathcal{X}$, where

$$P_{\mathbf{v},\mathbf{x}}(v, x) \overset{\text{def}}{=} \frac{|\{i : (v_i, x_i) = (v, x)\}|}{n}.$$

Note that if $(\mathbf{v}, \mathbf{x})$ are jointly robust typical, then $\forall i$, $p(v_i, x_i) > 0$, i.e., $\forall i$, $x_i \in v_i$.
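The robust typicality condition can be tested directly from empirical counts. The sketch below is illustrative: the function name `is_robust_typical` and the dictionary representation of p(v, x) are assumptions, with pairs missing from the dictionary treated as having probability zero.

```python
from collections import Counter

def is_robust_typical(v_seq, x_seq, p_vx, delta):
    """Check |P(v, x) - p(v, x)| <= delta * p(v, x) for every pair (v, x).

    p_vx maps (v, x) to its probability; absent pairs have probability zero.
    """
    n = len(v_seq)
    counts = Counter(zip(v_seq, x_seq))
    # A pair that occurs but has zero probability violates the condition,
    # since the right-hand side is then zero.
    if any(p_vx.get(pair, 0.0) == 0.0 for pair in counts):
        return False
    return all(abs(counts.get(pair, 0) / n - q) <= delta * q
               for pair, q in p_vx.items())

# Toy distribution: V is an independent set that contains X.
a, b = frozenset({1, 2}), frozenset({3})
p_vx = {(a, 1): 0.25, (a, 2): 0.25, (b, 3): 0.5}
print(is_robust_typical([a, a, b, b], [1, 2, 3, 3], p_vx, delta=0.1))
```

The early exit on zero-probability pairs is what enforces the remark above: every typical pair satisfies $x_i \in v_i$ whenever p(v, x) > 0 only for $x \in v$.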

The Y-transmitter proceeds similarly and sends $\varphi_Y(\mathbf{w})$. If a transmitter doesn't find such a sequence, it declares
an error, and if there is more than one such sequence, the transmitter selects one of them uniformly
at random.

Decoding: Given $\mathbf{z}$ and the index pair $(i_X, i_Y)$, declare $f(\mathbf{v}, \mathbf{w}, \mathbf{z})$ if there exists a unique jointly robust
typical $(\mathbf{v}, \mathbf{w}, \mathbf{z})$ such that $\varphi_X(\mathbf{v}) = i_X$ and $\varphi_Y(\mathbf{w}) = i_Y$, and such that $f(\mathbf{v}, \mathbf{w}, \mathbf{z})$ is defined. Otherwise
declare an error.

Probability of Error: There are two types of error. The first type of error occurs when no $\mathbf{v}$, respectively
no $\mathbf{w}$, is jointly robust typical with $\mathbf{x}$, respectively with $\mathbf{y}$. The probability of each of these two errors is
shown in [13] to be negligible for n large enough. Hence, the probability of the first type of error can be
made arbitrarily small by taking n large enough.

The second type of error refers to the Slepian-Wolf coding procedure. By symmetry of the encoding and
decoding procedures, the probability of error of the Slepian-Wolf coding procedure, averaged over source
outcomes, over the $\mathbf{v}$'s and $\mathbf{w}$'s, and over the binning assignments, is the same as the average error probability
conditioned on the transmitters selecting $\mathbf{V}^{(1)}$ and $\mathbf{W}^{(1)}$. Note that whenever $(\hat{\mathbf{V}}, \hat{\mathbf{W}}) = (\mathbf{V}^{(1)}, \mathbf{W}^{(1)})$,
there is no error, i.e., $f(\mathbf{X}, \mathbf{Y}, \mathbf{Z}) = f(\mathbf{V}^{(1)}, \mathbf{W}^{(1)}, \mathbf{Z})$ by definition of robust typicality and by the
definitions of V and W. We now compute the probability of the event $(\hat{\mathbf{V}}, \hat{\mathbf{W}}) \neq (\mathbf{V}^{(1)}, \mathbf{W}^{(1)})$.

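Define event $E(i, j)$ as

$$E(i, j) = \big\{ (\mathbf{V}^{(i)}, \mathbf{W}^{(j)}, \mathbf{Z}) \in \mathcal{T},\ \varphi_X(\mathbf{V}^{(i)}) = \varphi_X(\mathbf{V}^{(1)}),\ \varphi_Y(\mathbf{W}^{(j)}) = \varphi_Y(\mathbf{W}^{(1)}) \big\},$$

where $\mathcal{T}$ denotes the ($\delta$-)jointly robust typical set with respect to distribution p(v, w, z). The probability
of the second type of error is upper bounded as

$$\begin{aligned} P\big((\hat{\mathbf{V}}, \hat{\mathbf{W}}) \neq (\mathbf{V}^{(1)}, \mathbf{W}^{(1)})\big) &= P\Big(E^c(1, 1) \cup \big(\cup_{(i,j) \neq (1,1)} E(i, j)\big)\Big) \\ &\leq P(E^c(1, 1)) + \sum_{i \neq 1} P(E(i, 1)) + \sum_{j \neq 1} P(E(1, j)) + \sum_{i \neq 1, j \neq 1} P(E(i, j)). \end{aligned} \tag{10}$$

According to the properties of jointly robust typical sequences [13], we have

$$\begin{aligned} P(E^c(1, 1)) &\leq \varepsilon, \\ P(E(i, 1)) &\leq 2^{-n(R_X + I(V; W, Z)) + \varepsilon}, \\ P(E(1, j)) &\leq 2^{-n(R_Y + I(V, Z; W)) + \varepsilon}, \\ P(E(i, j)) &\leq 2^{-n(R_X + R_Y + I(V; W) + I(V, W; Z)) + \varepsilon}, \end{aligned} \tag{11}$$

for any $\varepsilon > 0$ and n large enough. Hence, from (10) and (11),

$$\begin{aligned} P\big((\hat{\mathbf{V}}, \hat{\mathbf{W}}) \neq (\mathbf{V}^{(1)}, \mathbf{W}^{(1)})\big) \leq \varepsilon &+ 2^{nI(V;X)} 2^{-n(R_X + I(V; W, Z))} \\ &+ 2^{nI(Y;W)} 2^{-n(R_Y + I(V, Z; W))} \\ &+ 2^{n(I(V;X) + I(Y;W))} 2^{-n(R_X + R_Y + I(V;W) + I(V, W; Z))}. \end{aligned}$$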

The error probability of the second type is thus negligible whenever

$$\begin{aligned} R_X &> I(V; X) - I(V; W, Z) \\ &= H(V | W, Z) - H(V | X) \\ &\overset{(a)}{=} I(V; X | W, Z), \\ R_Y &> I(Y; W) - I(V, Z; W) \\ &= H(W | V, Z) - H(W | Y) \\ &\overset{(b)}{=} I(Y; W | V, Z), \\ R_X + R_Y &> I(V; X) + I(Y; W) - I(V; W) - I(V, W; Z) \\ &= (H(V | Z) - H(V | X)) + (H(W | V, Z) - H(W | Y)) \\ &\overset{(c)}{=} I(V; X | Z) + I(Y; W | V, Z), \end{aligned} \tag{12}$$

where (a) and (b) follow from the Markov chains V - X - (W, Z) and (V, Z) - Y - W, respectively,
and where (c) follows from the Markov chains V - X - Z and (V, Z) - Y - W.
We end the proof by showing the equivalence of the conditions

$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V,Z})),$$

and

$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$
$$X \in V \in \mathrm{M}(\Gamma(G_{X|W,Z})).$$

We prove one direction; the proof for the other direction is the same. Assume

$$X \in V \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$
$$Y \in W \in \mathrm{M}(\Gamma(G_{Y|V,Z}))$$

holds. To prove that $W \in \mathrm{M}(\Gamma(G_{Y|X,Z}))$, we show that for any $w \in \mathcal{W}$,^9 $y_1, y_2 \in w$, $x \in \mathcal{X}$, and $z \in \mathcal{Z}$
such that

$$p(x, y_1, z) \cdot p(x, y_2, z) > 0,$$

we have

$$f(x, y_1, z) = f(x, y_2, z).$$

Since $P(X \in V) = 1$, there exists $v \in \mathcal{V}$ such that p(v|x) > 0, hence, by definition of the generalized
conditional characteristic graph $G_{Y|V,Z}$, we have^10

$$f(x, y_1, z) = f_X(v, y_1, z) = f_X(v, y_2, z) = f(x, y_2, z).$$

^9 $\mathcal{W}$ is the alphabet of random variable W.

^10 $f_X(v, y, z)$ is defined in the same way as $f_Y(x, w, z)$ in Definition 5.

To prove that $V \in \mathrm{M}(\Gamma(G_{X|W,Z}))$, note that for any $w \in \mathcal{W}$, $y_1, y_2 \in w$, $v \in \mathcal{V}$, $x_1, x_2 \in v$, and $z \in \mathcal{Z}$
such that $p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0$,
i) if $y_1 = y_2 = y$, then $f(x_1, y, z) = f(x_2, y, z)$, since $V \in \mathrm{M}(\Gamma(G_{X|Y,Z}))$.
ii) if $y_1 \neq y_2$, then

$$f(x_1, y_1, z) = f_X(v, y_1, z) = f_X(v, y_2, z) = f(x_2, y_2, z),$$

since $W \in \mathrm{M}(\Gamma(G_{Y|V,Z}))$.
Hence $V \in \mathrm{M}(\Gamma(G_{X|W,Z}))$ for both cases i) and ii). ■

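Proof of Proposition: Assume the received messages from the transmitters are $C_X = \varphi_X(\mathbf{X})$ and 
$C_Y = \varphi_Y(\mathbf{Y})$. Since the rate pair $(R_X, R_Y)$ is achievable, there exists a decoding function 

$$\psi(C_X, C_Y, \mathbf{Z}) = \mathbf{U},$$

such that 

$$P(\mathbf{U} \neq f(\mathbf{X}, \mathbf{Y}, \mathbf{Z})) \to 0 \quad \text{as } n \to \infty. \tag{13}$$

Also, since $f$ is partially invertible with respect to $X$, i.e., $X$ is a function of $f(X, Y, Z)$ and $Z$, there exists 
a function 

$$\psi'(C_X, C_Y, \mathbf{Z}) = (\hat{X}_1, \ldots, \hat{X}_n) = \hat{\mathbf{X}},$$

such that 

$$P(\hat{\mathbf{X}} \neq \mathbf{X}) \to 0 \quad \text{as } n \to \infty.$$

Define the distortion measures 

$$d_X(x, \hat{x}) = \begin{cases} 0, & \text{if } x = \hat{x}, \\ 1, & \text{otherwise,} \end{cases} \tag{14}$$

$$d_Y(x, y, z, u) = \begin{cases} 0, & \text{if } u = f(x, y, z), \\ 1, & \text{otherwise.} \end{cases} \tag{15}$$

Since, for each $i \in \{1, \ldots, n\}$, 

$$P(\mathbf{U} \neq f(\mathbf{X}, \mathbf{Y}, \mathbf{Z})) \geq P(U_i \neq f(X_i, Y_i, Z_i)),$$

we have 

$$P(\mathbf{U} \neq f(\mathbf{X}, \mathbf{Y}, \mathbf{Z})) \geq \frac{1}{n} \sum_{i=1}^{n} P(U_i \neq f(X_i, Y_i, Z_i)) = d_Y(\mathbf{X}, \mathbf{Y}, \mathbf{Z}, \mathbf{U}). \tag{16}$$

From (13) and (16), $d_Y(\mathbf{X}, \mathbf{Y}, \mathbf{Z}, \mathbf{U}) \to 0$ as $n \to \infty$. With the same argument one shows that 

$$d_X(\mathbf{X}, \hat{\mathbf{X}}) = \frac{1}{n} \sum_{i=1}^{n} d_X(X_i, \hat{X}_i) \to 0 \quad \text{as } n \to \infty.$$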
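The step behind (16) is that, for every realization, the block-error indicator is at least as large as the average per-letter Hamming distortion; taking expectations then gives the inequality. The sketch below illustrates this on a single realization. It is only an illustration under assumed data: the function `f`, the sequences, and the helper names `block_error` and `avg_hamming` are hypothetical.

```python
def block_error(u, target):
    """1 if the decoded sequence differs from the target in any position, else 0."""
    return int(any(a != b for a, b in zip(u, target)))

def avg_hamming(u, target):
    """Average per-letter Hamming distortion between two equal-length sequences."""
    return sum(a != b for a, b in zip(u, target)) / len(target)

# Hypothetical function and length-4 realizations of the sources and side information.
f = lambda x, y, z: (x + y + z) % 2
xs, ys, zs = [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]
target = [f(x, y, z) for x, y, z in zip(xs, ys, zs)]  # [1, 0, 0, 1]
u = [1, 0, 1, 1]                                       # decoder output, one wrong symbol

print(block_error(u, target))   # 1
print(avg_hamming(u, target))   # 0.25
```

Here one wrong symbol out of four already counts as a full block error, so the block error (1) dominates the average distortion (0.25).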
According to [2, Theorem 1], it follows that there exist a random variable $W'$ and functions $g_1(X, W', Z)$ 
and $g_2(X, W', Z)$ such that 

$$E\, d_X(X, g_1(X, W', Z)) = 0,$$

$$E\, d_Y(X, Y, Z, g_2(X, W', Z)) = 0,$$

$$(X, Z) - Y - W',$$

and 

$$R_X \geq H(X|W', Z),$$
$$R_Y \geq I(Y; W'|X, Z),$$
$$R_X + R_Y \geq H(X|Z) + I(Y; W'|X, Z). \tag{17}$$

Notice that since the distortion $E\, d_Y(X, Y, Z, g_2(X, W', Z))$ is equal to zero, for any $(x, y_1, w', z)$ and 
$(x, y_2, w', z)$ that satisfy 

$$p(x, y_1, w', z) \cdot p(x, y_2, w', z) > 0,$$

we should have 

$$f(x, y_1, z) = g_2(x, w', z) = f(x, y_2, z).$$

This, according to Lemma 2 in the Appendix, is equivalent to 

$$H(f(X, Y, Z)|X, W', Z) = 0.$$

Footnote: There is one caveat in applying the converse arguments of [2, Theorem 1]. In our case we need the distortion measures to be defined 
over functions of the sources. More precisely, we need Hamming distortion for source $X$ and Hamming distortion for a function defined 
over both sources $(X, Y)$ and side information $Z$. However, it is straightforward to extend the converse of [2, Theorem 1] to handle this 
setting (in the same way as [20], which shows that Wyner and Ziv's result [19] can be extended to the case where the distortion measure is defined over 
a function of the source and the side information). 

Since $H(f(X, Y, Z)|X, W', Z) = 0$ and since $(X, Z) - Y - W'$ holds, using Corollary 4 in the Appendix 
yields 

$$Y \in S_Y(W') \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$

and 

$$(X, Z) - Y - S_Y(W').$$

Also, by definition of $S_Y(W')$ (Definition 6) we have 

$$H(X|W', Z) = H(X|S_Y(W'), Z),$$
$$I(Y; W'|X, Z) = I(Y; S_Y(W')|X, Z),$$
$$H(X|Z) + I(Y; W'|X, Z) = H(X|Z) + I(Y; S_Y(W')|X, Z). \tag{18}$$

Taking $W = S_Y(W')$ and using (17) and (18) completes the proof. ■ 

Alternative Proof of Theorem [7] without cardinality bound: 

We present a proof that establishes the theorem except for the cardinality bound $|\mathcal{W}| \leq |\mathcal{Y}| + 2$ (see 
Proposition 2), using the canonical theory developed in [6]. Suppose there is a third transmitter who 
knows $U = f(X, Y, Z)$ and sends some information with rate $R_U$ to the receiver. For this problem, the 
rate region is the set of achievable rate triples $(R_X, R_Y, R_U)$. By intersecting this rate region with $R_U = 0$, 
we obtain the rate region for our two transmitter computation problem. 

Consider the three transmitter setting as above. Since $f(X, Y, Z)$ is partially invertible, we can equivalently 
assume that the goal for the receiver is to obtain $(X, U)$. This corresponds to $(M, J, L) = (3, 2, 0)$ in 
the Jana-Blahut notation, and, using [6, Theorem 6], the rate region is given by the set of all $(R_X, R_Y, R_U)$ 
such that 

\begin{align*}
R_X &\geq H(X|W', Z, U) \\
R_Y &\geq I(Y; W'|X, Z, U) \\
R_U &\geq H(U|X, W', Z) \tag{19} \\
R_X + R_Y &\geq I(X, Y; X, W'|Z, U) \\
R_X + R_U &\geq I(X, U; X, U|W', Z) = H(X|W', Z) + H(U|X, W', Z) \\
R_Y + R_U &\geq I(Y, U; W', U|X, Z) = I(Y; W'|X, Z) + I(U; W'|X, Y, Z) + H(U|X, W', Z) \\
R_X + R_Y + R_U &\geq I(X, Y, U; X, W', U|Z) \\
&= I(X, Y, U; X, W'|Z) + I(X, Y, U; U|X, W', Z) \\
&= I(X, Y; X, W'|Z) + I(U; X, W'|X, Y, Z) + H(U|X, W', Z) \tag{20}
\end{align*}

for some $W'$ that satisfies 

$$(X, Z, U) - Y - W'.$$

Due to this Markov chain we have 

$$I(U; W'|X, Y, Z) = I(U; X, W'|X, Y, Z) = 0. \tag{21}$$

Intersecting with $R_U = 0$, from (19) we derive that 

$$H(U|X, W', Z) = 0. \tag{22}$$

Hence, using (21) and (22), the last three inequalities in (20) become 

\begin{align*}
R_X + 0 &\geq H(X|W', Z) \geq H(X|W', Z, U) \\
R_Y + 0 &\geq I(Y; W'|X, Z) \\
&= H(W'|X, Z) - H(W'|X, Y, Z) \\
&= H(W'|X, Z) - H(W'|X, Y, Z, U) \\
&\geq H(W'|X, Z, U) - H(W'|X, Y, Z, U) = I(Y; W'|X, Z, U) \\
R_X + R_Y + 0 &\geq I(X, Y; X, W'|Z) = H(X|Z) + I(Y; W'|X, Z) \\
&\geq H(X|Z, U) + I(Y; W'|X, Z, U) = I(X, Y; X, W'|Z, U),
\end{align*}

which also imply the first three inequalities in (20). 
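Some of the information identities used above can be sanity-checked numerically on a small joint distribution. The following sketch is only an illustration and not part of the original argument: the alphabets, the function `f`, the channel from $Y$ to the auxiliary variable (written `W` in the code), and all helper names are hypothetical. The auxiliary variable is generated from $Y$ alone, so the Markov chain $(X, Z, U) - Y - W'$ holds by construction.

```python
import math
import random
from collections import defaultdict

def marginal(p, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, prob in p.items():
        out[tuple(outcome[i] for i in idx)] += prob
    return out

def H(p, idx):
    """Joint entropy (in bits) of the coordinates listed in idx."""
    return -sum(q * math.log2(q) for q in marginal(p, idx).values() if q > 0)

def cond_H(p, a, b):
    """H(A | B) = H(A, B) - H(B)."""
    return H(p, tuple(a) + tuple(b)) - H(p, tuple(b))

def cond_I(p, a, b, c):
    """I(A; B | C) = H(A | C) - H(A | B, C)."""
    return cond_H(p, a, c) - cond_H(p, a, tuple(b) + tuple(c))

random.seed(0)
X, Y, Z, W, U = range(5)              # coordinate positions inside each outcome
f = lambda x, y, z: (x * y + z) % 2   # hypothetical function

weights = {(x, y, z): random.random()
           for x in (0, 1) for y in (0, 1, 2) for z in (0, 1)}
total = sum(weights.values())
p = {}
for (x, y, z), wt in weights.items():
    for w in (0, 1):
        pw = 0.8 if w == y % 2 else 0.2   # w depends on y only
        p[(x, y, z, w, f(x, y, z))] = wt / total * pw

# I(X,U; X,U | W,Z) = H(X | W,Z) + H(U | X,W,Z)
lhs = cond_I(p, (X, U), (X, U), (W, Z))
rhs = cond_H(p, (X,), (W, Z)) + cond_H(p, (U,), (X, W, Z))
# I(U; W | X,Y,Z) = 0, as in (21)
zero = cond_I(p, (U,), (W,), (X, Y, Z))
# I(Y; W | X,Z) >= I(Y; W | X,Z,U)
before = cond_I(p, (Y,), (W,), (X, Z))
after = cond_I(p, (Y,), (W,), (X, Z, U))
print(abs(lhs - rhs) < 1e-9, abs(zero) < 1e-9, before >= after - 1e-9)
```

A numerical check of this kind does not replace the derivation; it only helps catch transcription errors in the individual terms.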



Therefore, when the last three inequalities of (20) hold and when $H(U|X, W', Z) = 0$, all other 
inequalities are satisfied. The rate region for the two transmitter problem thus becomes the set of rate 
pairs $(R_X, R_Y)$ that satisfy 

$$R_X \geq H(X|W', Z)$$
$$R_Y \geq I(Y; W'|X, Z)$$
$$R_X + R_Y \geq I(X, Y; X, W'|Z)$$

for some $W'$ that satisfies $(X, Z) - Y - W'$ and $H(U|X, W', Z) = 0$. Now, according to Corollary 4 we 
have 

$$Y \in S_Y(W') \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$

and 

$$(X, Z) - Y - S_Y(W').$$

Taking $W = S_Y(W')$ completes the proof. ■ 

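Proof of Proposition: Assume the received messages from the transmitters are $C_X = \varphi_X(\mathbf{X})$ and 
$C_Y = \varphi_Y(\mathbf{Y})$. Since the rate pair $(R_X, R_Y)$ is achievable, there exists a decoding function 

$$\psi(C_X, C_Y, \mathbf{Z}) = \mathbf{U},$$

such that 

$$P(\mathbf{U} \neq f(\mathbf{X}, \mathbf{Y}, \mathbf{Z})) \to 0 \quad \text{as } n \to \infty.$$

Define the distortion function 

$$d(x, y, z, u) = \begin{cases} 0, & \text{if } u = f(x, y, z), \\ 1, & \text{otherwise.} \end{cases} \tag{23}$$

From a similar argument as in the proof of Proposition 2, we have 

$$d(\mathbf{X}, \mathbf{Y}, \mathbf{Z}, \mathbf{U}) \to 0 \quad \text{as } n \to \infty.$$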

Hence, assuming the same distortion for both sources, $(R_X, R_Y) \in \mathcal{R}_D(0, 0)$ and, according to [16, 
Theorem 5.1], there exist random variables $V$ and $W$ and a function $g(V, W, Z)$ such that 

$$E\, d(X, Y, Z, g(V, W, Z)) = 0,$$

$$V - X - (Y, Z)$$

and 

$$R_X \geq I(X, Y; V|W, Z)$$
$$R_Y \geq I(X, Y; W|V, Z)$$
$$R_X + R_Y \geq I(X, Y; V, W|Z).$$

Footnote: $\mathcal{R}_D(D_X, D_Y)$ is the rate distortion region for correlated sources $X$ and $Y$ with distortion criteria $D_X$ and $D_Y$, respectively. 

It remains to show that $(V, W)$ satisfy $H(f(X, Y, Z)|V, W, Z) = 0$. 

Since the distortion is equal to 0, for any $(v, x_1, y_1, w, z)$ and $(v, x_2, y_2, w, z)$ that satisfy 

$$p(v, x_1, y_1, w, z) \cdot p(v, x_2, y_2, w, z) > 0,$$

we should have 

$$f(x_1, y_1, z) = g(v, w, z) = f(x_2, y_2, z).$$

This implies that $H(f(X, Y, Z)|V, W, Z) = 0$ by Lemma 2. ■ 

Appendix 

In this appendix we show that random variables $V$ and $W$ that satisfy $H(f(X, Y, Z)|V, W, Z) = 0$ can, in certain cases, be 
characterized as random variables defined over the multiset of independent sets of the (generalized) conditional characteristic 
graph. 

Lemma 2. Let 

$$(V, X, Y, W, Z) \in \mathcal{V} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{W} \times \mathcal{Z}$$

be distributed according to $p(v, x, y, w, z)$. The two following statements are equivalent: 

a) $H(f(X, Y, Z)|V, W, Z) = 0$. 

b) For all 

$$(x_1, y_1, z), (x_2, y_2, z) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{Z},$$

$$(v, w) \in \mathcal{V} \times \mathcal{W},$$

that satisfy 

$$p(v, x_1, y_1, w, z) \cdot p(v, x_2, y_2, w, z) > 0,$$

we have 

$$f(x_1, y_1, z) = f(x_2, y_2, z).$$

Proof: To show the equivalence, notice that $H(f(X, Y, Z)|V, W, Z) = 0$ if and only if there exists a function $g(v, w, z)$ 
such that 

$$f(X, Y, Z) = g(V, W, Z),$$

which is the same as b). ■ 
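The equivalence in Lemma 2 can be illustrated on a small example by computing the conditional entropy in a) directly and comparing it with the support condition in b). This is a minimal sketch under assumed toy data; the helper names `cond_entropy_of_f` and `condition_b`, the XOR function, and the two distributions are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def cond_entropy_of_f(p, f):
    """H(f(X,Y,Z) | V, W, Z) in bits, for a pmf p over tuples (v, x, y, w, z)."""
    joint = defaultdict(float)  # p(v, w, z, f-value)
    cond = defaultdict(float)   # p(v, w, z)
    for (v, x, y, w, z), prob in p.items():
        joint[(v, w, z, f(x, y, z))] += prob
        cond[(v, w, z)] += prob
    return -sum(q * math.log2(q / cond[k[:3]]) for k, q in joint.items() if q > 0)

def condition_b(p, f):
    """Statement b): f takes a single value on the support of each (v, w, z) class."""
    values = defaultdict(set)
    for (v, x, y, w, z), prob in p.items():
        if prob > 0:
            values[(v, w, z)].add(f(x, y, z))
    return all(len(s) == 1 for s in values.values())

f = lambda x, y, z: x ^ y  # hypothetical function: XOR of two fair independent bits

# Case 1: V = X and W = Y, so f is determined by (V, W, Z).
p_good = {(x, x, y, y, 0): 0.25 for x in (0, 1) for y in (0, 1)}
# Case 2: V = X but W is constant, so Y is unknown to the receiver.
p_bad = {(x, x, y, 0, 0): 0.25 for x in (0, 1) for y in (0, 1)}

print(abs(cond_entropy_of_f(p_good, f)) < 1e-12, condition_b(p_good, f))  # True True
print(cond_entropy_of_f(p_bad, f), condition_b(p_bad, f))                 # 1.0 False
```

In both cases the two statements agree: the entropy is zero exactly when every $(v, w, z)$ class sees a single value of $f$ on its support.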

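Lemma 3. Given $(X, Y, Z) \sim p(x, y, z)$ and $f(X, Y, Z)$, $(V, W)$ satisfy 

$$H(f(X, Y, Z)|V, W, Z) = 0,$$

and 

$$V - X - (Y, W, Z)$$

$$(V, X, Z) - Y - W,$$

if and only if they satisfy 

$$X \in S_X(V) \in \mathrm{M}(\Gamma(G_{X|Y,Z}))$$
$$Y \in S_Y(W) \in \mathrm{M}(\Gamma(G_{Y|S_X(V),Z})),$$

and 

$$S_X(V) - X - (Y, S_Y(W), Z)$$

$$(S_X(V), X, Z) - Y - S_Y(W).$$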

As a special case of Lemma 3, taking $V = X$, we derive the following corollary. 
Corollary 4. Given $(X, Y, Z) \sim p(x, y, z)$ and $f(X, Y, Z)$, $W$ satisfies 

$$H(f(X, Y, Z)|X, W, Z) = 0,$$

and 

$$(X, Z) - Y - W,$$

if and only if 

$$Y \in S_Y(W) \in \mathrm{M}(\Gamma(G_{Y|X,Z})),$$

and 

$$(X, Z) - Y - S_Y(W).$$

Proof of Lemma 3: The lemma is a direct consequence of the following four claims, proved thereafter: 
a. $X \in S_X(V)$ and $Y \in S_Y(W)$ always hold. 
b. 

$$V - X - (Y, W, Z)$$

$$(V, X, Z) - Y - W,$$

if and only if 

$$S_X(V) - X - (Y, S_Y(W), Z)$$

$$(S_X(V), X, Z) - Y - S_Y(W).$$

Further, when these Markov chains hold, Claims c. and d. below hold: 

c. $(V, W)$ satisfy 

$$H(f(X, Y, Z)|V, W, Z) = 0,$$

if and only if for all $x_1, x_2 \in S_X(v)$ and $y_1, y_2 \in S_Y(w)$ such that 

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0,$$

it holds that 

$$f(x_1, y_1, z) = f(x_2, y_2, z).$$

d. 

$$S_X(V) \in \mathrm{M}(\Gamma(G_{X|Y,Z}))$$

$$S_Y(W) \in \mathrm{M}(\Gamma(G_{Y|S_X(V),Z})),$$

if and only if for all $x_1, x_2 \in S_X(v)$ and $y_1, y_2 \in S_Y(w)$ such that 

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0,$$

it holds that 

$$f(x_1, y_1, z) = f(x_2, y_2, z).$$

Claims a. and b. follow directly from Definition 6. 
c. Notice that due to the Markov chains 

$$V - X - (Y, W, Z)$$

$$(V, X, Z) - Y - W,$$

we can write 

$$p(v, x, y, w, z) = p(x, y, z) \cdot p(v|x) \cdot p(w|y).$$

Hence 

$$p(v, x_1, y_1, w, z) \cdot p(v, x_2, y_2, w, z) > 0,$$

if and only if 

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) \cdot p(v, x_1) \cdot p(v, x_2) \cdot p(y_1, w) \cdot p(y_2, w) > 0,$$

which is equivalent to the constraints 

$$x_1, x_2 \in S_X(v),$$

$$y_1, y_2 \in S_Y(w),$$

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0.$$

Using Lemma 2 completes the proof of the claim. 
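The factorization step in the proof of Claim c. can be checked by brute force on a small example. The sketch below assumes, as the argument above suggests, that $S_X(v)$ collects the symbols $x$ that can occur together with $v$ (and similarly for $S_Y(w)$); this reading of Definition 6, the toy distributions, and all names in the code are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical toy distributions.
p_xyz = {(0, 0, 0): 0.5, (1, 1, 0): 0.25, (1, 0, 0): 0.25}  # p(x, y, z)
p_v_given_x = {0: {"a": 1.0}, 1: {"a": 0.5, "b": 0.5}}       # p(v | x)
p_w_given_y = {0: {"c": 1.0}, 1: {"d": 1.0}}                 # p(w | y)

def joint(v, x, y, w, z):
    """p(v, x, y, w, z) = p(x, y, z) p(v | x) p(w | y), valid under the two Markov chains."""
    return p_xyz.get((x, y, z), 0) * p_v_given_x[x].get(v, 0) * p_w_given_y[y].get(w, 0)

def S_X(v):
    """Assumed reading: the set of x that can occur together with v."""
    return {x for x in p_v_given_x if p_v_given_x[x].get(v, 0) > 0}

def S_Y(w):
    """Assumed reading: the set of y that can occur together with w."""
    return {y for y in p_w_given_y if p_w_given_y[y].get(w, 0) > 0}

def equivalence_holds():
    """Check that positivity of the joint product matches the three constraints."""
    xs, ys, zs = (0, 1), (0, 1), (0,)
    for v, w, z in product("ab", "cd", zs):
        for x1, x2, y1, y2 in product(xs, xs, ys, ys):
            lhs = joint(v, x1, y1, w, z) * joint(v, x2, y2, w, z) > 0
            rhs = (x1 in S_X(v) and x2 in S_X(v)
                   and y1 in S_Y(w) and y2 in S_Y(w)
                   and p_xyz.get((x1, y1, z), 0) * p_xyz.get((x2, y2, z), 0) > 0)
            if lhs != rhs:
                return False
    return True

print(S_X("a"), S_X("b"))    # {0, 1} {1}
print(equivalence_holds())   # True
```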
d. The proof of the converse part is straightforward from the definition of the (generalized) conditional characteristic graph. 
To prove the direct part, for $x_1, x_2 \in S_X(v)$, $y_1, y_2 \in S_Y(w)$ such that 

$$p(x_1, y_1, z) \cdot p(x_2, y_2, z) > 0,$$

we show that 

$$f(x_1, y_1, z) = f(x_2, y_2, z).$$

i. If $y_1 = y_2$, then since 

$$S_X(v) \in \mathrm{M}(\Gamma(G_{X|Y,Z})),$$

for $x_1, x_2 \in S_X(v)$, $f(x_1, y_1, z) = f(x_2, y_2, z)$ (the same argument is valid if $x_1 = x_2$). 
ii. If $x_1 \neq x_2$, $y_1 \neq y_2$, then from 

$$S_Y(w) \in \mathrm{M}(\Gamma(G_{Y|S_X(V),Z})),$$

we have 

$$f(x_1, y_1, z) = f_X(S_X(v), y_1, z) = f_X(S_X(v), y_2, z) = f(x_2, y_2, z).$$

Footnote: $f_X(v, y, z)$ is defined in the same way as $f_Y(x, w, z)$ in Definition 5. 



